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Abstract 

Large scale tensors, including large scale Hankel tensors, have many applications 
in science and engineering. In this paper, we propose an inexact curvilinear search 
optimization method to compute Z- and H-eigenvalues of mth order n dimensional 
Hankel tensors, where n is large. Owing to the fast Fourier transform, the computa¬ 
tional cost of each iteration of the new method is about 0{mn\og{mn)). Using the 
Cayley transform, we obtain an effective curvilinear search scheme. Then, we show 
that every limiting point of iterates generated by the new algorithm is an eigen-pair 
of Hankel tensors. Without the assumption of a second-order sufficient condition, we 
analyze the linear convergence rate of iterate sequence by the Kurdyka-Lojasiewicz 
property. Finally, numerical experiments for Hankel tensors, whose dimension may up 
to one million, are reported to show the efficiency of the proposed curvilinear search 
method. 
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1 Introduction 


With the coming era of massive data, large scale tensors have important applications in 
science and engineering. How to store and analyze these tensors? This is a pressing and 
challenging problem. In the literature, there are two strategies for manipulating large scale 
tensors. The hrst one is to exploit their structures such as sparsity [3]. For example, 
we consider an online store (e.g. Amazon.com) where users may review various products 
[35] . Then, a third order tensor with modes: users, items, and words could be formed 
naturally and it is sparse. The other one is to use distributed and parallel computation 
[T6l [12]. This technique could deal with large scale dense tensors, but it depends on a 
supercomputer. Recently, researchers applied these two strategies simultaneously for large 
scale tensors [SSlill]. 

In this paper, we consider a class of large scale dense tensors with a special Hankel struc¬ 
ture. Hankel tensors appear in many engineering problems such as signal processing mm, 
automatic control [H], and geophysics [391!^ . For instance, in nuclear magnetic resonance 
spectroscopy [51], a Hankel matrix was formed to analyze the time-domain signals, which is 
important for brain tumour detection. Papy et ah mm improved this method by using 
a high order Hankel tensor to replace the Hankel matrix. Ding et ah [H] proposed a fast 
computational framework for products of a Hankel tensor and vectors. On the mathematical 
properties, Luque and Thibon [3l] explored the Hankel hyperdeterminants. Qi [13] and Xu 
[53] studied the spectra of Hankel tensors and gave some upper bounds and lower bounds 
for the smallest and the largest eigenvalues. In |13] , Qi raised a question: Can we construct 
some efficient algorithms for the largest and the smallest H- and Z-eigenvalues of a Hankel 
tensor? 

Numerous applications of the eigenvalues of higher order tensors have been found in 
science and engineering, such as automatic control [3Z], medical imaging [mi^ lU]. quantum 
information [36], and spectral graph theory [13]. For example, in magnetic resonance imaging 
[45] . the principal Z-eigenvalues of an even order tensor associated to the hber orientation 
distribution of a voxel in white matter of human brain denote volume factions of several 
nerve hbers in this voxel, and the corresponding Z-eigenvectors express the orientations 
of these nerve hbers. The smallest eigenvalue of tensors rehects the stability of a nonlinear 
multivariate autonomous system in automatic control [37]. For a given even order symmetric 
tensor, it is positive semidehnite if and only if its smallest H- or Z-eigenvalue is nonnegative 
021 . 

The conception of eigenvalues of higher order tensors was dehned independently by Qi 
[42] and Lim [32] in 2005. Unfortunately, it is an NP-hard problem to compute eigenvalues 
of a tensor even though the involved tensor is symmetric [26] . For two and three dimensional 
symmetric tensors, Qi et ah [44] proposed a direct method to compute all of its Z-eigenvalues. 
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It was pointed out in [301 El] that the polynomial system solver, NSolve in Mathematica, 
could be used to compute all of the eigenvalues of lower order and low dimensional tensors. 
We note that the mathematical software Maple has a similar command solve which is also 
applicable for the polynomial systems of eigenvalues of tensors. 

For general symmetric tensors, Kolda and Mayo [30] proposed a shifted symmetric higher 
order power method to compute its Z-eigenpairs. Recently, they [31] extended the shifted 
power method to generalized eigenpairs of tensors and gave an adaptive shift. Based on the 
nonlinear optimization model with a compact unit spherical constraint, the power methods 
El project the gradient of the objective at the current iterate onto the unit sphere at each 
iteration. Its computation is very simple but may not converge [29] . Kolda and Mayo [SO] [31] 
introduced a shift to force the objective to be (locally) concave/convex. Then the power 
method produces increasing/decreasing steps for computing maximal/minimal eigenvalues. 
The sequence of objectives converges to eigenvalues since the feasible region is compact. The 
convergence of the sequence of iterates to eigenvectors is established under the assumption 
that the tensor has hnitely many real eigenvectors. The linear convergence rate is estimated 
by a hxed-point analysis. 

Inspired by the power method, various optimization methods have been established. Han 
[23] proposed an unconstrained optimization model, which is indeed a quadratic penalty 
function of the constrained optimization for generalized eigenvalues of symmetric tensors. 
Hao et al. [23] employed a subspace projection method for Z-eigenvalues of symmetric 
tensors. Restricted by a unit spherical constraint, this method minimizes the objective in a 
big circle of n dimensional unit sphere at each iteration. Since the objective is a homogeneous 
polynomial, the minimization of the subproblem has a closed-form solution. Additionally, 
Hao et al. [25] gave a trust region method to calculate Z-eigenvalues of symmetric tensors. 
The sequence of iterates generated by this method converges to a second order critical point 
and enjoys a locally quadratic convergence rate. 

Since nonlinear optimization methods may produce a local minimizer, some convex op¬ 
timization models have been studied. Hu et al. [27] addressed a sequential semi-dehnite 
programming method to compute the extremal Z-eigenvalues of tensors. A sophisticated 
Jacobian semi-dehnite relaxation method was explored by Cui et al. HI. A remarkable 
feature of this method is the ability to compute all of the real eigenvalues of symmetric 
tensors. Recently, Chen et al. [8] proposed homotopy continuation methods to compute all 
of the complex eigenvalues of tensors. When the order or the dimension of a tensor grows 
larger, the CPU times of these methods become longer and longer. 

In some applications [571139] , the scales of Hankel tensors can be quite huge. This highly 
restricted the applications of the above mentioned methods in this case. How to compute 
the smallest and the largest eigenvalues of a Hankel tensor? Can we propose a method to 
compute the smallest and the largest eigenvalues of a relatively large Hankel tensor, say 
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1, 000, 000 dimension? This is one of the motivations of this paper. 

Owing to the multi-linearity of tensors, we model the problem of eigenvalues of Hankel 
tensors as a nonlinear optimization problem with a unit spherical constraint. Our algorithm 
is an inexact steepest descent method on the unit sphere. To preserve iterates on the unit 
sphere, we employ the Cayley transform to generate an orthogonal matrix such that the 
new iterate is this orthogonal matrix times the current iterate. By the Sherman-Morrison- 
Woodbury formula, the product of the orthogonal matrix and a vector has a closed-form 
solution. So the subproblem is straightforward. A curvilinear search is employed to guar¬ 
antee the convergence. Then, we prove that every accumulation point of the sequence of 
iterates is an eigenvector of the involved Hankel tensor, and its objective is the corresponding 
eigenvalue. Furthermore, using the Kurdyka-Lojasiewicz property of the eigen-problem of 
tensors, we prove that the sequence of iterates converges without an assumption of second 
order sufficient condition. Under mild conditions, we show that the sequence of iterates has 
a linear or a sublinear convergence rate. Numerical experiments show that this strategy is 
successful. 

The outline of this paper is drawn as follows. We introduce a fast computational frame¬ 
work for products of a well-structured Hankel tensor and vectors in Section 2. The compu¬ 
tational cost is cheap. In Section 3, we show the techniques of using the Cayley transform 
to construct an effective curvilinear search algorithm. The convergence of objective and 
iterates are analyzed in Section 4. The Kurdyka-Lojasiewicz property is applied to analyze 
an inexact line search method. Numerical experiments in Section 5 address that the new 
method is efficient and promising. Finally, we conclude the paper with Section 6. 

2 Hankel tensors 

Suppose A is an mth order n dimensional real symmetric tensor 

A= ioiij = l,...,n,j = l,...,m, 

where all of the entries are real and invariant under any index permutation. Two products 
of the tensor A and a column vector x G M"" used in this paper are dehned as follows. 

• Ax^ is a scalar 

n 

Ax.^= 

i\ ,. ..,^771 = 1 

• Ax.^~^ is a column vector 

n 

^ for i = l,...,n. 
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When the tensor A is dense, the computations of produces Ax"^ and Ax™'~^ require 0{n^) 
operations, since the tensor A has entries and we must visit all of them in the process 
of calculation. When the tensor is symmetric, the computational cost for these products is 
about 0{'nA/m\) [IB]. Obviously, they are expensive. In this section, we will study a special 
tensor, the Hankel tensor, whose elements are completely determined by a generating vector. 
So there exists a fast algorithm to compute products of a Hankel tensor and vectors. Let us 
give the definitions of two structured tensors. 

Definition 1 An mth order n dimensional tensor l-i is called a Hankel tensor if its entries 
satisfy 

- him—rm for ij = 1, ... ,77-, J = 1, . . . ,171. 

The vector v = (no,ni,... ,Vm(n-i))~^ with length i = m{n — 1) + 1 is called the generating 
vector of the Hankel tensor TL. 

An mth order ^ dimensional tensor C is called an anti-circulant tensor if its entries 
satisfy 

Cii,i 2 ,...,*m “ '^(*i+* 2 H Hm—m mod t)i for ij = 1, j = 1, , m. 

It is easy to see that H is a sub-tensor of C. Since for the same generating vector v we 
have 

Cil,i2,...,*m L ' ' ' 1 j 1, . . . ,777. 

For example, a third order two dimensional Hankel tensor with a generating vector v = 
{vo,Vi,V 2 ,V 3 y is 


Vo 

Vi 

Vl 

V2 

Vi 

V2 

V2 

V3 


It is a sub-tensor of an anti-circulant tensor with the same order and a larger dimension 


Vo 

Vl 

V2 

V3 

Vl 

V2 

V3 

Vo 

V2 

V3 

Vo 

Vl 

V3 

Vo 

Vl 

V2 ' 

Vl 

V2 

V3 

Vo 

V2 

V3 

Vo 

Vl 

V3 

Vo 

Vl 

V2 

Vo 

Vl 

V2 

V3 

V2 

V3 

Vo 

Vl 

V3 

Vo 

Vl 

V2 

Vo 

Vl 

V2 

V3 

Vl 

V2 

V3 

Vo 

. V3 

Vo 

Vl 

V2 

Vo 

Vl 

V2 

V3 

Vl 

V2 

V3 

Vo 

V2 

V3 

Vo 

Vl . 


As discovered in (TS] Theorem 3.1], the 777 th order i dimensional anti-circulant tensor C 
could be diagonalized by the i-hj-i Fourier matrix F^, i.e., C = FFp, where P is a diagonal 
tensor whose diagonal entries are diag('D) = Ff^\. It is well-known that the computations 
involving the Fourier matrix and its inverse times a vector are indeed the fast (inverse) 
Fourier transform fft and if ft, respectively. The computational cost is about 0{£\ogt) 
multiplications, which is significantly smaller than 0{£^) for a dense matrix times a vector 
when the dimension £ is large. 
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Now, we are ready to show how to compute the products introduced in the beginning of 
this section, when the involved tensor has a Hankel structure. For any x G M”', we dehne 
another vector y G such that 


y = 


X 

Of—n 


where ^ = m{n — 1) + 1 and Qi-n is a zero vector with length i — n. Then, we have 


^x™ = = V{Feyr = ifft(v)^ (fft(y)°™). 


To obtain "Hx"* we hrst compute 

Cy'”-' = Fe (P(F,y)™-i) = fft (ifft(v) o (fft(y)°(™-0)) . 

Then, the entries of vector is the leading n entries of Cy”^“^. Here, o denotes the 

Hadamard product such that {AoB)ij = Three matrices A, B and AoB have the 

same size. Furthermore, we dehne A°^ = A o ■■■ o A as the Hadamard product of k copies 
of H. 

Since the computations of 'Hx™' and require 2 and 3 fft/iffts, the computational 

cost is about Cl(mnlog(mn)) and obviously cheap. Another advantage of this approach is 
that we do not need to store and deal with the tremendous Hankel tensor explicitly. It is 
sufficient to keep and work with the compact generating vector of that Hankel tensor. 


3 A curvilinear search algorithm 


We consider the generalized eigenvalue mm of an mth order n dimensional Hankel tensor 

n 




where m is even, B is an mth order n dimensional symmetric tensor and it is positive dehnite. 
If there is a scalar A and a real vector x satisfying this system, we call A a generalized 
eigenvalue and x its associated generalized eigenvector. Particularly, we hnd the following 
dehnitions from the literature, where the computation on the tensor B is straightforward. 


• Qi |l2] called a real scalar A a Z-eigenvalue of a tensor "H and a real vector x its 
associated Z-eigenvector if they satisfy 


= Ax and x^x = 1. 


This dehnition means that the tensor B is an identity tensor S such that £^x™ ^ = 


ixir-2x. 
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• \iB = T, where 


1 if *1 = • • • = i 
0 otherwise , 






m 


mi 


the real scalar A is called an H-eigenvalue and the real vector x is its associated Id- 
eigenvector |12]. Obviously, we have for i = 1 ,..., n. 


To compute a generalized eigenvalue and its associated eigenvector, we consider the 
following optimization model with a spherical constraint 


min /(x) = s.t. ||x|| = 1, (1) 

where || • || denotes the Euclidean norm or its induced matrix norm. The denominator of the 
objective is positive since the tensor B is positive dehnite. By some calculations, we get its 
gradient and Hessian, which are formally presented in the following lemma. 


Lemma 1 Suppose that the objective is defined as in m- Then, its gradient is 

g(x) = . 

^ ' Bx^ V -Sx”* J 

And its Hessian is 

m{m — l)'Hx’^~^ m{m — l)'HxHBx^~‘^ + m^('Hx™'“^ @ 

Hx™ {Bx^y 

nH'Hx^{Bx^~^ @ Bx^~y 

{Bx^y 

where x@ y = xy~’' -|- yx"*". 

Let = {x e R” I x^x = 1} be the spherical feasible region. Suppose the current 
iterate is x e and the gradient at x is g(x). Because 

-"g(-) = £, = 0. ('i) 

the gradient g(x) of x G S„_i is located in the tangent plane of §„_i at x. 

Lemma 2 Suppose ||g(x)|| = e, where x G and e is a small number. Denote X = 

Then, we have 

ll-Hx^^-^-Anx^-^ll = 0(e). 

Moreover, if the gradient g(x) at x vanishes, then A = /(x) is a generalized eigenvalue and 
X is its associated generalized eigenvector. 


( 2 ) 


(3) 
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Proof Recalling the definition of gradient ([2]), we have 


ll-Hx 


m—1 


ASx 


m—1 1 


m 


-e. 


Since the tensor B is positive definite and the vector x belongs to a compact set -Bx”^ 
has a finite upper bound. Thus, the first assertion is valid. 

If e = 0, we immediately know that A = /(x) is a generalized eigenvalue and x is its 
associated generalized eigenvector. □ 

Next, we construct the curvilinear search path using the Cayley transform [22]. Cayley 
transform is an effective method which could preserve the orthogonal constraints. It has 
various applications in the inverse eigenvalue problem [20], p-harmonic flow [^, and matrix 
optimization |52j . 

Suppose the current iterate is x^ G Sn-i and the next iterate is x^+i. To preserve the 
spherical constraint x^^_;^Xfc_|_i = x^x^ = 1, we choose the next iterate Xfc+i such that 


Xfc+l Q^ki 


(5) 


where Q G is an orthogonal matrix, whose eigenvalues do not contain —1. Using the 

Cayley transform, the matrix 

Q = {I + W)-\I -W) (6) 

is orthogonal if and only if the matrix W G is skew-symmetric 0 Now, our task is to 

select a suitable skew-symmetric matrix W such that g(xfc)''^(xfc+i — x^) < 0. For simplicity, 
we take the matrix W as 

W = ab'^ — ba"^, (7) 

where a, b G M” are two undetermined vectors. From ([5]) and ([6]), we have 

Xfc+i - Xfc = -W (xfc -F Xfc+i). 

Then, by ((71), it yields that 

g(xfc)’^(xfc+i - Xfc) = -[(g(xfc)’^a)b^ - (g(xfc)’^b)a’^](xfc -F Xfc+i). 

For convenience, we choose 


a = Xfc and b = —ag(xfc). (8) 

Here, a is a positive parameter, which serves as a step size, so that we have some freedom 
to choose the next iterate. According to this selection and (jl]), we obtain 

g(xfc)"^(xfc+i - Xfc) = -a||g(xfc)||^xj(xfc-FXfc+i) 

= -«l|g(xfc)||^(l TxjQxfc). 

^See ‘ http://en.wikipedia.org/wiki/Cayley_transform ’. 





Since —1 is not an eigenvalue of the orthogonal matrix Q, we have 1 + > 0 for 

x^Xfc = 1. Therefore, the conclusion g(xfc)^(xfc+i — x^) < 0 holds for any positive step size 
a. 

We summarize the iterative process in the following Theorem. 


Theorem 1 Suppose that the new iterate x^+i is generated by and Then, 

the following assertions hold. 

• The iterative scheme is 


Xfc+i(a) 


l-a^||g(x,)|p 

1 + a2||g(xfc)||2 ^ 


2a 

1 + a2||g(xfc)||2 


g(Xfc). 


• The progress made by Xfc+i is 


g(xfc)"^(xfc+i(a) - Xfc) 


2c^||g(x,)||^ 

1 + Q;2||g(x;.)||2‘ 


(9) 


( 10 ) 


Proof From the equality (|4)) and the Sherman-Morrison-Woodbury formula, we have 


Xfc+i(a) 


(/ - axfcg(xfc)’^ -F ag(xfc)xfc ) ^(J -F axfcg(xfc)’^ - ag(xfc)xfc )xfc 

(/ ag(xfc)xj - axfcg(xfc)’^)"^(xfc - ag(xk)) 



ag(xk) -Xfc 


1 0 
0 1 


+ 




ag(xfc) 


T 


ag(xfc) -Xfc 



ag(xfc)’^ 


(Xfc - Q!g(Xfc)) 




1 -1 

-1 

1 

Xfc - ag(xfc) - 

ag(xfc) -Xfc 

«^l|g(x.)|P 1 _ 


-«^l|g(x.)||^ _ 


l-0^||g(Xfc)||^ 

l + a2||g(xfc)||2 ^ 


2a 

l + a2||g(Xfc)||2 


g(Xfc). 


The proof of flTOl) is straightforward. □ 

Whereafter, we devote to choose a suitable step size a by an inexact curvilinear search. 
At the beginning, we give a useful theorem. 


Theorem 2 Suppose that the new iterate Xfc+i(o) is generated by Then, we have 


d/(xfc+i(a)) 


da 


Q = 0 


-2||g(xfc)|r 
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Algorithm 1 A curvilinear search algorithm (ACSA). 

1: Give the generating vector v of a Hankel tensor Ti, the symmetric tensor B, an initial 
unit iterate xi, parameters 7] G (0, |], 13 G (0,1), hi = 1 < Omax, and k ^ 1. 

2: while the sequence of iterates does not converge do 

3: Compute "Hx™ and by the fast computational framework introduces in Section 

2 . 

4: Calculate ;Bx^, \k = /(x^) = and g(xfc) by (E]). 

5: Choose the smallest nonnegative integer £ and determine ak = l3^o.k such that 

/(xfc+i(afc)) < /(xfc) - r7afc||g(xfc)|p, (11) 

where Xfc+i(a) is calculated by ([H])- 
6 : Update the iterate x^+i = Xfc+i(Q!fc). 

7: Choose an initial step size o.k+i G (0, Omax] for the next iteration. 

8: /c <(— /c + 1. 

9: end while 


Proof By some calculations, we get 



-2 

1 + a2||g(xfc)||2 


g(Xfc) + 


-4a||g(xfc)|P 

(l+a2||g(x,)||2)2 


(Xfc - Q!g(Xfc)). 


Hence, x)._,_;^(0) = —2g(xfc). Furthermore, Xfc+i(0) = x^. Therefore, we obtain 


= g(xfc+i(0))Vfc+i(a) = g(xfc)^(-2g(xfc)) = -2||g(xfc)||l 

The proof is completed. □ 

According to Theorem [2], for any constant rj G (0, 2), there exists a positive scalar a such 
that for all a G (0, a], 

/(xfc+i(a)) - /(xfc) < -r/a||g(xfc)|p. 


d/(Xfc+i(Q)) 

da 


Hence, the curvilinear search process is well-dehned. 

Now, we present a curvilinear search algorithm (ACSA) formally in Algorithm [1] for the 
smallest generalized eigenvalue and its associated eigenvector of a Hankel tensor. If our aim 
is to compute the largest generalized eigenvalue and its associated eigenvector of a Hankel 
tensor, we only need to change respectively ([9]) and (fTTD used in Steps 5 and 6 of the ACSA 
algorithm to 


Xfc+i(a) 


l-a^||g(x,)|p 

1 + a2||g(Xfe)||2 ^ 


2a 

1 + a2||g(Xfe)||2 


g(Xfc), 


and 

/(xfc+i(afc)) > /(xfc) +?7afc||g(xfc)|p. 


10 












When the Z-eigenvalue of a Hankel tensor is considered, we have Sx.^ = llxH™ = 1 and 
the objective /(x) is a polynomial. Then, we could compute the global minimizer of the step 
size ak (the exact line search) in each iteration as |23]. However, we use a cheaper inexact 
line search here. The initial step size of the next iteration follows Dai’s strategy [T3] 

ll^Xfcll 

ttfc+l — TTr-[7; 

which is the geometric mean of Barzilai-Borwein step sizes |1]. 

4 Convergence analysis 

Since the optimization model ([T]) has a nice algebraic nature, we will use the Kurdyka- 
Lojasiewicz property [331 E] to analyze the convergence of the proposed ACSA algorithm. 
Before we start, we give some basic convergence results. 

4.1 Basic convergence results 

If the ACSA algorithm terminates finitely, there exists a positive integer k such that g(xfc) = 
0. According to Lemma [21 /(x^) is a generalized eigenvalue and x^ is its associated gener¬ 
alized eigenvector. 

Next, we assume that ACSA generates an infinitely sequence of iterates. 

Lemma 3 Suppose that the even order symmetric tensor B is positive definite. Then, all 
the functions, gradients, and Hessians of the objective m at feasible points are bounded. 
That is to say, there is a positive constant M such that for all x G §n-i 

|/(x)|<M, ||g(x)||<M, and ||Tr(x)|| < M. (12) 

Proof Since the spherical feasible region §n-i is compact, the denominator Hx™ of the 
objective is positive and bounds away from zero. Recalling Lemma [H we get this theorem 
immediately. □ 

Theorem 3 Suppose that the infinite sequence {Afc} is generated by ACSA. Then, the se¬ 
quence {Afc} is monotonously decreasing. And there exists a A* such that 

lim Afc = A*. 

fc—>-oo 

Proof Since Xk = /(x^) which is bounded and monotonously decreasing, the infinite se¬ 
quence {Afc} must converge to a unique A*. □ 

This theorem means that the sequence of generalized eigenvalues converges. To show the 
convergence of iterates, we first prove that the step sizes bound away from zero. 
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Lemma 4 Suppose that the step size au is generated by ACS A. Then, for all iterations k, 
we get 

(0 -r,\R 

(13) 


(2 - n)l3 

^ = ®min ^ 0. 


5M 

Proof Let a = According to the curvilinear search process of ACSA, it is sufficient 

to prove that the inequality flTTl) holds if ak G (0,a]. 

From the iterative formula ([9]) and the equality (jl]), we get 

2 


||Xfc+i(Q!) -XfclP = 


- 2 «^l|g(x.)|p 


2a 


rXfc - 


1 + a2||g(xfc)||2 1 + a2||g(xfc)||2 

4a^||g(x,)r||x,||2 + 4a2||g(x,)||2 

(l+a2||g(x,)||2)2 

4«^l|g(x.)|P 


g(Xfc) 


l + a2||g(Xfc)||2' 


Hence, 


||xfc+i(Q;) - Xfcll = 

From the mean value theorem, fl9l) . 


2a||g(xfc)|| 


v/1 + a2||g(xfc)||2 

, and flT^ . we have 


/(xfc+i(a)) - /(xfc) < g(xfc)’^(xfc+i(a) - Xfc) + -M||xfc+i(a) - Xfc| 


< 


l + «^l|g(x.)P 

«l|g(Xfc)|P 


M 


-2a ||g(xfc)|| g(xfc) Xfc-2a||g(xfc)|| + Y4a ||g(xfc)||- 


(4aM - 2). 


1 + a2||g(Xfc)||2 

It is easy to show that for all a G (0, a] 

4aM — 2 < —ri{l + 

Therefore, we have 


/(xfc+,(a)) - /(Xfc) < J ^2]g(^^)||2 «llg(^^)ll' < -Vc^M^kW- 


(14) 


The proof is completed. 


□ 


Theorem 4 Suppose that the infinite seguence {xfc} is generated by ACSA. Then, the se- 
guence {xfc} has an accumulation point at least. And we have 

lim ||g(xfc)|| = 0. (15) 

k^oo 

That is to say, every accumulation point o/{xfc} is a generalized eigenvector whose associated 
generalized eigenvalue is A*. 
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Proof Since the sequence of objectives {/(x^)} is monotonously decreasing and bounded, 
by (mD and ([I3]), we have 

OO OO OO 

2M > /(xi) - A* = E /(xfc) - /(xfc+i) > ^r7afc||g(xfc)||^ > 5 ^l|g(xA:)ir- 

k=l k=l k=l 

It yields that 

X—^ II / \ii9 ‘2M 

V||g(xfc)||^< - <+ 00 . 16 

^ rjarain 

Thus, the limit (1T5|) holds. 

Let Xoo be an accumulation point of {x^}. Then Xqo belongs to the compact set §„_i 
and ||g(xoo)|| = 0. According to LemmaO Xoo is a generalized eigenvector whose associated 
eigenvalue is /(xoo) = A*. □ 


4.2 Further results based on the Kurdyka-Lojasiewicz property 

In this subsection, we will prove that the iterates {x^} generated by ACS A converge without 
an assumption of the second-order sufficient condition. The key tool of our analysis is the 
Kurdyka-Lojasiewicz property. This property was hrst discovered by S. Lojasiewicz [33] 
in 1963 for real-analytic functions. Bolte et ah [5] extended this property to nonsmooth 
subanalytic functions. Whereafter, the Kurdyka-Lojasiewicz property was widely applied to 
analyze regularized algorithms for nonconvex optimization HiEg. Signihcantly, it seems to 
be new to use the Kurdyka-Lojasiewicz property to analyze an inexact line search algorithm, 
e.g., ACSA proposed in Section 3. 

We now write down the Kurdyka-Lojasiewicz property [5] Theorem 3.1] for completeness. 


Theorem 5 (Kurdyka-Lojasiewicz (KL) property) Suppose that x* is a critical point 
of /{x.). Then there is a neighborhood ^ an exponent 9 G [0,1), and a constant Ci 

such that for all x E ^, the following ineguality holds 


Here, we define 0° = 1. 


l/(x) - /(xdl^ 

l|g(x)|| 


ECi. 


(17) 


Lemma 5 Suppose that x* is one of the accumulation points of {xj.}. For the convenience 
of using the Kurdyka-Lojasiewicz property, we assume that the initial iterate xi satisfies 
Xi e SS{x„,,p) = {x G M” I ||x —x*|| < p} C ^ where 

‘IF 

p > ^(1 ~ + l|xi -x*||. 
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Then, we have the following two assertions: 


Xfce^(x*,p), V/c = l,2,..., (18) 

and 

^ iixfc+i - Xfcii < 1/(^0 - (19) 

Proof We prove ffTSl) by the induction. First, it is easy to see that xi G <^(x*,p). Next, we 
assume that there is an integer K such that 


Xfc G <^(x*,p), V 1 < A; < JF. 


Hence, the KL property (fT7)) holds in these iterates. Finally, we now prove that x^+i G 

e^(x*,p). 

For the convenience of presentation, we dehne a scalar function 

‘A'(s) = Y^\s - ■ 

Obviously, <p(s) is a concave function and its derivative is <p'(s) = )p ^1 ^ > /(x.). 

Then, for any 1 < k < K, we have 


F(/(xfc)) - (p(/(xfc+i)) > 

[by KL property] > 

[since flTTl) ] > 

> 

[because of fflTl) ] > 

It yields that 


¥^'(/(Xfc))(/(Xfc) - /(Xfc+i)) 

c 

|/(xt)-/(x.)|»(^(’"‘) ■ 
^(/(xO-/(x.«)) 

pafc||g(xfc)|| 

^l + a2||g(xfc)||2 

^||Xfc+l -Xfc||. 


K 

iixfc+i - Xfc 

k=l 


< -^TUi^k)) - T{f{^k+i)) 

= ^(</?(/(xi)) -<p(/(Xi^+i))) 

< -(p(/(xi)). 

V 


( 20 ) 
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So, we get 


K 

||xx+l-X*|| < ^ ||Xfc+i - Xfcll + ||xi - x*|| 

fc=l 
2 

< -</?(/(xi)) + ||xi - x*|| 

V 

< P- 

Thus, Xi^+i G ^(x*,p) and ffTSl) holds. 

Moreover, let K —)■ oo in (|20|) . We obtain ([19]). □ 

Theorem 6 Suppose that the infinite sequence of iterates {x^} is generated by ACS A. Then, 
the total sequence {x^} has a finite length, i.e., 

^ ||Xfc+i - Xfcll < +00, 

k 

and hence the total sequence {x^} converges to a unique critical point. 

Proof Since the domain of /(x) is compact, the infinite sequence {x*,} generated by ACSA 
must have an accumulation point x*. According to Theorem H] x* is a critical point. Hence, 
there exists an index /cq, which could be viewed as an initial iteration when we use Lemma 
(5] such that Xfc^ G ^(x*,p). From Lemma[5l we have YlT=ko ll^fc+i — ^k\\ < +oo. Therefore, 
the total sequence {x^} has a finite length and converges to a unique critical point. □ 


Lemma 6 There exists a positive constant C 2 such that 


||Xfc+i - Xfcll > C2||g(Xfc)||. 


( 21 ) 


Proof Since Or 


fC >ak>a 
||Xfc+i - Xfcll = 


> 0 and o. we have 

2afc||g(xfc)|| 


Let Co = 


— 2ar, 


1 + tin 


\/l + «fcl|g(Xfc)||^ 
We get this lemma. 

If iW 


> 


‘20iin 


1 -|- Q^max-'^ 


l|g(Xfc)||. 


□ 


Theorem 7 Suppose that x* is the critical point of the infinite sequence of iterates {xfc} 
generated by ACSA. Then, we have the following estimations. 

• If 9 E (0, i], there exists a 7 > 0 and g G (0,1) such that 

||Xfc - x*|| < 7/. 


If 9 E (1,1), there exists a 7 > 0 such that 


|xfc — x*|| < 'jk~ 
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Proof Without loss of generality, we assume that xi G ^(x*,p). For convenience of follow¬ 
ing analysis, we dehne 

OO 

Afc = ^ ||xi - Xj+i|| > ||xfc - x*||. 


i=k 


Then, we have 


Ak - 


|Xi - Xi+i| 


i=k 


[since (IT^ ] < 


[KL property] < 


2Ci 


-|/(Xfc) - /(x*)| 


i-e 


v{i-o) 
-^^^(|/(x,)-/(x,)|®)^ 

(C'illg(xfc)ll)^ 


v{i-o) 

2C 

[for dUD] < (CiC 2 "^||xfc - Xfc+i 


v{i-o) 

1 1-6 

2CfC^ ^ 


(A,-A 


k+lj 


( 22 ) 


7]{i-e) 

= C'a (Afc — Afc+i) ® , 

where is a positive constant. 

If 0 G (0, |), we have > 1. When the iteration k is large enough, the inequality 
implies that 

Afc < C3(Afc — Afc+i). 

That is 

A / - 1 ^ 

^k+1 S — -p:; ‘^k- 

L'S 

Hence, recalling ||xfc — x*|| < A^, we obtain the estimation if we take g = . 

Otherwise, we consider the case 9 G Let h{s) = s~^. Obviously, h{s) is 

monotonously decreasing. Then, the inequality fl22]) could be rewritten as 

< h(Afc)(Afc-A,+i) 

= / h(A) ds 

^k + l 
/•Afe 

h{s) ds 

^fe+i 


< 


1 _ fj 29-1 29-1 

-(A, - A, 


“ 29 

Denote v = < 0 since 9 G (|, !)• Then, we get 


^k+l 


Al^,-Al>uC-^-^ ^C,>0. 
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It yields that for all > /c, 


<[^K + CA{k - K)]^ , 

where the last inequality holds when the iteration k is sufficiently large. □ 

We remark that if the Hessian ih(x*) at the critical point x* is positive dehnite, the 
key parameter 6 in the Kurdyka-Lojasiewicz property is 6^ = |. Under Theorem [TJ the 
sequence of iterates generated by ACS A has a linear convergence rate. In this viewpoint, 
the Kurdyka-Lojasiewicz property is weaker than the second order sufficient condition of x* 
being a minimizer. 

5 Numerical experiments 

To show the efficiency of the proposed ACSA algorithm, we perform some numerical exper¬ 
iments. The parameters used in ACSA are 

r] = .001, P = .5, Omax = 10000. 

We terminate the algorithm if the objectives satisfy 

max(l, |Afc|) 

or the number of iterations exceeds 1000. The codes are written in MATLAB R2012a and 
run in a desktop computer with Intel Core E8500 CPU at 3.17GHz and 4GB memory running 
Windows 7. 

We will compare the following four algorithms in this section. 

• An adaptive shifted power method [30l 131] (Power M.) is implemented as eig_sshopni 
and eig_geap in Tensor Toolbox 2.6 for Z- and H-eigenvalues of even order symmetric 
tensors. 

• An unconstrained optimization approach [23] (Han’s UOA) is solved by fminunc in 
MATLAB with settings: GradObj : on, LargeScale: of f, TolX:l.e-10, TolFun: 1. e-8, 
Maxiter:10000, Display:off. 

• For general symmetric tensors without considering a Hankel structure, we implement 
ACSA as ACSA-general. 

• The ACSA algorithm (ACSA-Hankel) is proposed in Section 3 for Hankel tensors. 


< 10-12 
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5.1 Small Hankel tensors 

First, we examine some small tensors, whose Z- and H-eigenvalues could be computed ex¬ 
actly. 

Example 1 ( [38] ) A Hankel tensor A whose entries are defined as 

= sin(ii +i 2 ^ -him), = 1, 2 ,..., n, j = 1, 2,..., m. 

Its generating vector is \ = (sin(m), sin(m -hi),..., sin(mn))''". 

If m = 4 and n = 5, there are five Z-eigenvalues which are listed as follows IZ3I3/ 

Ai = 7.2595, As = 4.6408, A 3 = 0.0000, A 4 = -3.9204, A 5 = -8.8463. 


Table 1: Computed Z-eigenvalues of the Hankel tensor in Example [H 


Algorithms 

Power M. 

Han’s UOA 

ACSA-general 

ACSA-Hankel 

-8.846335 

54% 

58% 

72% 

72% 

-3.920428 

46% 

42% 

28% 

28% 

CPUt. (sec) 

23.09 

9.34 

8.39 

0.67 


We test four kinds of algorithms: power method, Han’s UOA, ACSA-general and ACSA- 
Hankel. For the purpose of obtaining the smallest Z-eigenvalue of the Hankel tensor, we 
select 100 random initial points on the unit sphere. The entries of each initial point is hrst 
chosen to have a Gaussian distribution, then we normalize it to a unit vector. The resulting 
Z-eigenvalues and CPU times are reported in Table [H All of the four methods hnd the 
smallest Z-eigenvalue —8.846335. But the occurrences for each method Ending the smallest 
Z-eigenvalue are different. We say that the ACS A algorithm proposed in Section 3 could 
find the extremal eigenvalues with a higher probability. 

Form the viewpoint of totally computational times, ACSA-general, and ACSA-Hankel 
are faster than the power method and Han’s UOA. When the Hankel structure of a fourth 
order five dimensional symmetric tensor A is exploited, it is unexpected that the new method 
is about 30 times faster than the power method. 

Example 2 We study a parameterized fourth order four dimensional Hankel tensor He 
whose generating vector has the following form 

V, = (8 - e, 0, 2, 0,1, 0,1, 0,1, 0, 2, 0, 8 - e)^. 

If e = 0, Ho is positive semidefinite but not positive definite m When the parameter e 
is positive and trends to zero, the smallest Z- and H-eigenvalues are negative and trends to 
zero. In this example, we will illustrate this phenomenon by a numerical approach. 
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Figure 1; The smallest Z- and H-eigenvalues of the parameterized fourth order four dimen¬ 
sional Hankel tensors. 

Table 2: CPU times (second) for computing Z- and H-eigenvalues of the parameterized 
Hankel tensors shown in Example [2l 


Algorithms 

Power M. 

Han’s UOA 

ACSA-general 

ACSA-Hankel 

Z-eigenvalues 

41.980 

46.629 

17.878 

1.498 

H-eigenvalues 

29.562 

45.833 

16.973 

1.544 

Total CPU times 

71.542 

92.462 

34.851 

3.042 


Again, we compare the power method, Han’s UOA, ACSA-general, and ACSA-Hankel 
for computing the smallest Z- and H-eigenvalues of the parameterized Hankel tensors in 
Example O For the purpose of accuracy, we slightly modify the setting TolX:l.e-12, 
TolFun: 1 .e-12 for Han’s UOA. In each case, thirty random initial points on a unit sphere 
are selected to obtain the smallest Z- or H-eigenvalues. When the parameter e decreases 
from 1 to 10“^°, the smallest Z- and H-eigenvalues returned by these four algorithm are 
congruent. We show this results in Figure [H When e trends to zero, the smallest Z- and 
H-eigenvalues are negative and going to zero too. 

The detailed CPU times for these four algorithms computing the smallest Z- and H- 
eigenvalues of the parameterized fourth order four dimensional Hankel tensors are drawn in 
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Table [2l Obviously, even without exploiting the Hankel structure, ACSA-general is two times 
faster than the power method and Han’s UOA. Furthermore, when the fast computational 
framework for the products of a Hankel tensor time vectors is explored, ACSA-Hankel saves 
about 90% CPU times. 


5.2 Large scale problems 

When the Hankel structure of higher order tensors is explored, we could compute eigenvalues 
and associated eigenvectors of large scale Hankel tensors. 


Example 3 A Vandermonde tensor 43, {5^ is a special Hankel tensor. Let 


a = 


n 

n — 1 


and 


f3 


1 — n 
n 


Then, Ui = (1, a, a^, and U2 = (1, (4, (4"^, ■ ■ ■, (4'^ are two Vandermonde vec¬ 

tors. The following mth order n dimensional symmetric tensor 


Hv = Ui (g) Ui (g) • • • (g) Ui + U2 0 U2 0 • • • (g) U2 

"-V-' '-V-' 

m times m times 

is a Vandermonde tensor which satisfies the Hankel structure. Here (g is the outer product. 
Obviously, the generating vector ofTLy is v = (2, a + /3,..., 


Proposition 1 Suppose the mth order n dimensional Hankel tensor Tiy is defined as in 
Example\^ Then, whenn is even, the largest Z-eigenvalue of Tiy is ||ui||™' and its associated 
eigenvector is 


l|Ul|| 


Proof Since afi = —1 and n is even, ui and U2 are orthogonal. We consider the optimization 
problem 

max Hy^^ = (u^x)'” + (ujx)™, 
s.t. x^x = 1. 

Since ||ui|| > ||u 2 ||, when x = the above optimization problem obtains its maximal 
value lluill’”. We write down its KKT condition, and it is easy to see that (||ui||"*, n^) is 
a Z-eigenpair of Hy. □ 

Now, we employ the proposed ACSA algorithm which works with the generating vector 
of a Hankel tensor to compute the largest Z-eigenvalue of the Vandermonde tensor dehned in 
Example [3l We consider different orders m = 4, 6, 8 and various dimension n = 10,..., 10®. 
For each case, we choose ten random initial points, which has a Gaussian distribution on 
a unit sphere. Table [3] shows the computed largest Z-eigenvalues and the associated CPU 
times. For all case, the resulting largest Z-eigenvalue is agree with Proposition [H When the 
dimension of the tensor is one million, the computational times for fourth order and sixth 
order Vandermonde tensors are about 35 and 55 minutes respectively. 
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Table 3: The largest Z-eigenvalues of Vandermonde tensor in Example [3l 


m 

n 

largest Z-eigenvalues 

Occurrences 

CPU times (sec.) 

4 

10 

9.487902e02 

8 

0.062 

4 

100 

1.013475e05 

8 

0.140 

4 

1,000 

1.019800e07 

7 

0.889 

4 

10,000 

1.020431e09 

8 

9.048 

4 

100,000 

1.020494ell 

10 

150.245 

4 

1,000,000 

1.020500el3 

5 

2066.592 

6 

10 

2.922505e04 

5 

0.140 

6 

100 

3.226409e07 

5 

0.234 

6 

1,000 

3.256659el0 

7 

1.919 

6 

10,000 

3.259683el3 

7 

17.753 

6 

100,000 

3.259985el6 

9 

211.537 

6 

1,000,000 

3.260016el9 

4 

3190.439 

8 

10 

9.002029e05 

5 

0.359 

8 

100 

1.027131el0 

5 

0.437 

8 

1,000 

1.039992el4 

7 

2.917 

8 

10,000 

1.041279el8 

7 

30.561 

8 

100,000 

1.041408e22 

8 

1058.248 


Example 4 An mth order n dimensional Hilbert tensor is defined as 


Hh 


1 

+ *2 H-h - m + 1 


ij = 1,2,... ,n,j = 1,2,... ,m. 


Its generating vector is \ = When the order m is even, the Hilbert 

tensors are positive definite. Its largest Z-eigenvalue and largest H-eigenvalues are bounded 
by sin - and sin - respectively. 

^ n n ^ ^ 


We illustrate by numerical experiments to show whether these bounds are tight? First, 
for the dimension varying from ten to one million, we calculate the theoretical upper bounds 
of the largest Z-eigenvalues of corresponding fourth order and sixth order Hilbert tensors. 
Then, for each Hilbert tensor, we choose ten initial points and employ the ACSA algorithm 
equipped with a fast computational framework for products of a Hankel tensor and vectors 
to compute the largest Z-eigenvalues. These results are shown in the left sub-hgure of Figure 
[2J The right sub-hgure of Figure [2] shows the corresponding CPU times for ACSA-Hankel. 
We can see that the theoretical upper bounds for the largest Z-eigenvalues of the Hilbert 
tensors are almost tight up to a constant multiple. 
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Fourth order Hilbert tensor 



n 


Fourth order Hilbert tensor 


10 10 
n 

Sixth order Hilbert tensor 



Figure 2: The largest Z-eigenvalue and its upper bound for Hilbert tensors. 


Fourth order Hilbert tensor 



n 


Fourth order Hilbert tensor 


10 10 
n 

Sixth order Hilbert tensor 



Figure 3: The computed largest H-eigenvalue and its upper bound for Hilbert tensors. 

Similar results for the largest H-eigenvalues and their theoretical upper bounds of Hilbert 
tensors are illustrated in Figure |3l 

6 Conclusion 

We proposed an inexact steepest descent method processing on a unit sphere for generalized 
eigenvalues and associated eigenvectors of Hankel tensors. Owing to the fast computation 
framework for the products of a Hankel tensor and vectors, the new algorithm is fast and 
efficient as shown by some preliminary numerical experiments. Since the Hankel structure 
is well-exploited, the new method could deal with some large scale Hankel tensors, whose 
dimension is up to one million in a desktop computer. 
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